SVM

Before moving on with the to-do list, let’s throw a Random Forest at it.


For many reasons, Random Forest is usually a very good baseline model. In this particular case I started with polynomial OLS as the baseline model, simply because the correlations made it evident that the relationship between temperature and consumption follows a polynomial shape. But let’s go back to our beloved RF.

/home/runner/work/strom/strom/.venv/lib/python3.10/site-packages/sklearn/svm/_base.py:1250: ConvergenceWarning:

Liblinear failed to converge, increase the number of iterations.
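The warning above comes from scikit-learn's Liblinear-based linear SVM solver. Two common remedies are to standardize the features and to raise `max_iter`. The sketch below assumes the estimator is `LinearSVR` (the source does not say which one is used) and runs on synthetic data; the feature scales and `max_iter=10_000` are illustrative choices, not the settings used in this project.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVR

# Synthetic data with features on very different scales (assumption for the demo)
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3)) * np.array([10.0, 1.0, 100.0])
y = 0.5 * X[:, 0] + 2.0 * X[:, 1] + 0.01 * X[:, 2] + rng.normal(scale=0.1, size=300)

# Scaling the inputs usually helps Liblinear converge;
# max_iter raises the iteration cap (the default is 1000)
model = make_pipeline(
    StandardScaler(),
    LinearSVR(max_iter=10_000, random_state=0),
)
model.fit(X, y)

preds = model.predict(X)
r2 = model.score(X, y)  # R2 on the training data
```

If the warning persists after scaling, loosening `tol` or lowering `C` are further options worth trying.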

Model Cards provide a framework for transparent, responsible reporting. 
 Use the vetiver `.qmd` Quarto template as a place to start, 
 with vetiver.model_card()
Writing pin:
Name: 'wd-svm'
Version: 20251228T105330Z-ae806
♻️  stepit 'svm_raw': is up-to-date. Using cached result for `strom.modelling.assess_model()` 2025-12-28 10:53:30

Metrics

Metric                                 Single Split train  Single Split test  CV test    CV train
MAE - Mean Absolute Error              2.253863            2.404185           2.684980   3.410023
MSE - Mean Squared Error               15.655275           21.344099          12.038683  27.480901
RMSE - Root Mean Squared Error         3.956675            4.619967           2.982166   5.136139
R2 - Coefficient of Determination      0.832035            0.774003           -2.236684  0.722929
MAPE - Mean Absolute Percentage Error  0.226225            0.250558           0.395428   0.309699
EVS - Explained Variance Score         0.833088            0.775890           0.532616   0.818702
MeAE - Median Absolute Error           1.362887            1.408742           2.555536   2.625422
D2 - D2 Absolute Error Score           0.674623            0.662013           -0.765698  0.520175
Pinball - Mean Pinball Loss            1.126932            1.202092           1.342490   1.705011
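To make the metrics above concrete, here is a minimal NumPy sketch of their usual definitions. The helper name `regression_metrics` is hypothetical, and the project may well compute them differently (for example with the equivalents in `sklearn.metrics`). The pinball loss is taken at a quantile level of 0.5, which is an assumption, though it agrees with the reported values being half the MAE.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Illustrative re-implementation of the reported metrics (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    abs_err = np.abs(resid)
    mae = abs_err.mean()
    mse = np.mean(resid ** 2)
    # Baselines: predicting the mean (for R2) and the median (for D2)
    mse_mean_baseline = np.mean((y_true - y_true.mean()) ** 2)
    mae_median_baseline = np.mean(np.abs(y_true - np.median(y_true)))
    return {
        "MAE": mae,
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "R2": 1.0 - mse / mse_mean_baseline,
        "MAPE": np.mean(abs_err / np.abs(y_true)),
        "EVS": 1.0 - np.var(resid) / np.var(y_true),
        "MeAE": np.median(abs_err),
        "D2": 1.0 - mae / mae_median_baseline,
        "Pinball": 0.5 * mae,  # pinball loss at quantile level 0.5
    }

# Toy example: predictions that are perfect except for a constant bias of +3
y_true = np.array([10.0, 12.0, 14.0, 16.0])
y_pred = y_true + 3.0
m = regression_metrics(y_true, y_pred)
# R2 is negative (-0.8) while EVS is 1.0: EVS ignores the bias, R2 does not
```

The toy example shows one way a negative R2 can coexist with a positive EVS, as in the CV test column: EVS only looks at the variance of the residuals, so a systematic offset in the predictions hurts R2 and D2 but not EVS.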

Scatter plot matrix

Observed vs. Predicted and Residuals vs. Predicted

Check for …

Check the residuals to assess the goodness of fit:

  • Are they white noise, or is there a pattern?
  • Is there heteroscedasticity?
  • Is there non-linearity?
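The visual checks above can be complemented with a couple of rough numbers. The following is a minimal sketch, not the diagnostics used to produce the plots: the helper `residual_checks` is hypothetical, and the synthetic data is built so that the noise grows with the prediction.

```python
import numpy as np

def residual_checks(y_true, y_pred):
    """Rough numeric counterparts to the residual plots (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    centered = resid - resid.mean()
    # Lag-1 autocorrelation: near 0 for white noise, far from 0 if there is a pattern
    lag1 = np.sum(centered[1:] * centered[:-1]) / np.sum(centered ** 2)
    # Correlation between |residual| and prediction: a crude heteroscedasticity hint
    spread_corr = np.corrcoef(np.abs(centered), y_pred)[0, 1]
    return {
        "mean_residual": resid.mean(),
        "lag1_autocorr": lag1,
        "abs_resid_vs_pred_corr": spread_corr,
    }

# Synthetic example: the noise standard deviation is proportional to the prediction
rng = np.random.default_rng(42)
y_pred = np.linspace(5.0, 30.0, 500)
y_true = y_pred + rng.normal(scale=0.1 * y_pred)
checks = residual_checks(y_true, y_pred)
```

A clearly positive `abs_resid_vs_pred_corr` suggests the residual spread grows with the predicted value, which is the fan shape to look for in the Residuals vs. Predicted plot.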

Normality of Residuals:

Check for …

  • Are residuals normally distributed?
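Besides a Q-Q plot or histogram, normality can be checked numerically. The sketch below uses the Shapiro-Wilk test plus skewness and excess kurtosis from `scipy.stats`; the helper `normality_report` and the synthetic residuals are illustrative assumptions, not part of the project code.

```python
import numpy as np
from scipy import stats

def normality_report(residuals):
    """Summarize how close residuals are to a normal distribution (hypothetical helper)."""
    residuals = np.asarray(residuals, dtype=float)
    w_stat, p_value = stats.shapiro(residuals)
    return {
        "shapiro_w": float(w_stat),      # close to 1 for normal data
        "p_value": float(p_value),       # small values reject normality
        "skewness": float(stats.skew(residuals)),
        "excess_kurtosis": float(stats.kurtosis(residuals)),  # 0 for a normal
    }

rng = np.random.default_rng(7)
gaussian_report = normality_report(rng.normal(size=500))
skewed_report = normality_report(rng.exponential(size=500) - 1.0)
```

With large samples the Shapiro-Wilk test flags even small deviations, so the skewness and kurtosis values are often more informative than the p-value alone.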

Leverage

Scale-Location plot

Residuals Autocorrelation Plot

Residuals vs Time

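Well, not that bad on the single split (R2 of 0.83 on train vs. 0.77 on test), but it is overfitting quite a lot under cross-validation, where the test R2 drops to -2.24.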

♻️  stepit 'grid_search_pipe': is up-to-date. Using cached result for `strom.modelling.grid_search_pipe()` 2025-12-28 10:53:34
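The internals of `strom.modelling.grid_search_pipe()` are not shown here, so the following is only a sketch of what a grid search over an SVM pipeline can look like with scikit-learn. The estimator (`LinearSVR`), the parameter grid, the scoring metric and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVR

# Synthetic regression data (stand-in for the real features and target)
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.2, size=200)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("svr", LinearSVR(max_iter=10_000, random_state=0)),
])

# Hypothetical grid: regularization strength and width of the epsilon tube
param_grid = {"svr__C": [0.1, 1.0, 10.0], "svr__epsilon": [0.0, 0.5]}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_mean_absolute_error")
search.fit(X, y)

best_params = search.best_params_
best_mae = -search.best_score_  # scoring is negated MAE, so flip the sign
```

For time-ordered data such as consumption readings, passing `cv=TimeSeriesSplit(n_splits=5)` from `sklearn.model_selection` instead of a plain integer may give a more honest estimate, since it avoids training on observations that come after the test fold.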

Model Cards provide a framework for transparent, responsible reporting. 
 Use the vetiver `.qmd` Quarto template as a place to start, 
 with vetiver.model_card()
Writing pin:
Name: 'wd-svm'
Version: 20251228T105334Z-37e61
♻️  stepit 'svm_tuned': is up-to-date. Using cached result for `strom.modelling.assess_model()` 2025-12-28 10:53:34

Metrics

Metric                                 Single Split train  Single Split test  CV test    CV train
MAE - Mean Absolute Error              2.191867            2.337857           1.293039   2.436891
MSE - Mean Squared Error               15.723957           22.742093          2.980500   18.131235
RMSE - Root Mean Squared Error         3.965344            4.768867           1.645820   4.255542
R2 - Coefficient of Determination      0.831298            0.759200           0.090348   0.816841
MAPE - Mean Absolute Percentage Error  0.191461            0.199679           0.214997   0.194748
EVS - Explained Variance Score         0.832178            0.772519           0.505620   0.817841
MeAE - Median Absolute Error           1.216609            1.212842           1.094747   1.465526
D2 - D2 Absolute Error Score           0.683573            0.671338           0.149081   0.656816
Pinball - Mean Pinball Loss            1.095934            1.168928           0.646519   1.218445

Scatter plot matrix

Observed vs. Predicted and Residuals vs. Predicted

Check for …

Check the residuals to assess the goodness of fit:

  • Are they white noise, or is there a pattern?
  • Is there heteroscedasticity?
  • Is there non-linearity?

Normality of Residuals:

Check for …

  • Are residuals normally distributed?

Leverage

Scale-Location plot

Residuals Autocorrelation Plot

Residuals vs Time

TODOs